Using Tree Automata for XML Mining and Web Mining with Constraints
نویسندگان
چکیده
Most work on pattern mining focus on simple data structures like itemsets or sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structure, XML and Web Log databases and social network, require much more sophisticated data structures (trees or graphs) for their specification. Here, interesting patterns involve not only frequent object values (labels) appearing in the graphs (or trees) but also frequent specific topologies found in these structures. In a recent work, we have introduced the method CobMiner for tree pattern mining with constraints. CobMiner uses tree automata as a mechanism to specify user constraints over tree patterns. In this paper, we focus on applications of this method in two different contexts: XML Mining and Web Usage Mining. An extensive set of experiments executed over real data in these two fields allow us to conclude that CoBMiner is an efficient method for mining datasets whose elements are specified as trees.
منابع مشابه
Tree pattern mining with tree automata constraints
Most work on pattern mining focuses on simple data structures such as itemsets and sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structures, XML and Web log databases and social networks, require much more sophisticated data structures such as trees and graphs. In these contexts, interesting patterns involve not only freq...
متن کاملConstraint-based Tree Pattern Mining
Most work on pattern mining focus on simple data structures like itemsets or sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structure, XML and Web Log databases, social network, require much more sophisticated data structures (trees or graphs) for their specification. Here, interesting patterns involve not only frequent ob...
متن کاملOptimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining
The Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page is one type of web data, was regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...
متن کاملDevelopment of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism
Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...
متن کاملDevelopment of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism
Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...
متن کامل